Tags: llm* + machine learning*


  1. This article explores various aspects of BERT, including the landscape at the time of its creation, a detailed breakdown of the model architecture, and the construction of a task-agnostic fine-tuning pipeline, demonstrated on sentiment analysis. Despite being one of the earliest LLMs, BERT remains relevant today and continues to find applications in both research and industry.
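    A minimal sketch of what such a fine-tuning pipeline typically looks like with the Hugging Face Transformers Trainer; the bert-base-uncased checkpoint, the IMDB dataset, and the hyperparameters are illustrative choices, not the article's exact code.

```python
# Illustrative BERT sentiment fine-tuning pipeline (checkpoint, dataset,
# and hyperparameters are assumptions, not the article's exact code).
from datasets import load_dataset
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=2)  # binary sentiment labels

dataset = load_dataset("imdb")  # any labeled sentiment corpus works

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=256)

dataset = dataset.map(tokenize, batched=True)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="bert-sentiment",
                           per_device_train_batch_size=16,
                           num_train_epochs=1),
    train_dataset=dataset["train"],
    eval_dataset=dataset["test"],
    tokenizer=tokenizer,  # enables dynamic padding via the default collator
)
trainer.train()
```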
  2. This article explains how to use the Sentence Transformers library to finetune and train embedding models for a variety of applications, such as retrieval augmented generation, semantic search, and semantic textual similarity. It covers the training components, dataset format, loss function, training arguments, evaluators, and trainer.
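    A minimal sketch of that training flow using the Sentence Transformers v3 trainer API; the base checkpoint, dataset, and loss below are illustrative choices, not the article's exact code.

```python
# Illustrative embedding-model fine-tuning with Sentence Transformers v3;
# the checkpoint, dataset, and loss are assumptions for this sketch.
from datasets import load_dataset
from sentence_transformers import (SentenceTransformer,
                                   SentenceTransformerTrainer, losses)

model = SentenceTransformer("microsoft/mpnet-base")  # any base encoder works

# (anchor, positive) pairs; MultipleNegativesRankingLoss treats the other
# in-batch positives as negatives, a common setup for retrieval and search.
train_dataset = load_dataset("sentence-transformers/all-nli", "pair",
                             split="train")
loss = losses.MultipleNegativesRankingLoss(model)

trainer = SentenceTransformerTrainer(model=model, train_dataset=train_dataset,
                                     loss=loss)
trainer.train()
model.save("finetuned-embedding-model")
```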
  3. This paper introduces Cross-Layer Attention (CLA), an extension of Multi-Query Attention (MQA) and Grouped-Query Attention (GQA) for reducing the size of the key-value cache in transformer-based autoregressive large language models (LLMs). The authors demonstrate that CLA can reduce the cache size by another 2x while maintaining nearly the same accuracy as unmodified MQA, enabling inference with longer sequence lengths and larger batch sizes.
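    The core idea, letting some layers reuse the key/value projections of an earlier layer instead of computing and caching their own, can be sketched in a few lines of PyTorch. This is an illustrative reading of the mechanism (single-head, no residuals or caching machinery), not the paper's reference implementation.

```python
# Sketch of cross-layer K/V sharing; illustrative only.
import torch
import torch.nn as nn
import torch.nn.functional as F

class CLABlock(nn.Module):
    """Attention block that either computes its own K/V or reuses the K/V
    of an earlier layer, so only the computing layers fill the KV cache."""
    def __init__(self, d_model: int, computes_kv: bool):
        super().__init__()
        self.q_proj = nn.Linear(d_model, d_model)
        self.computes_kv = computes_kv
        if computes_kv:
            self.k_proj = nn.Linear(d_model, d_model)
            self.v_proj = nn.Linear(d_model, d_model)

    def forward(self, x, shared_kv=None):
        q = self.q_proj(x)
        if self.computes_kv:
            k, v = self.k_proj(x), self.v_proj(x)
        else:
            k, v = shared_kv  # reuse K/V from the previous computing layer
        out = F.scaled_dot_product_attention(q, k, v, is_causal=True)
        return out, (k, v)

# Sharing factor 2: only every other layer projects (and would cache) K/V,
# which is where the ~2x KV-cache reduction comes from.
layers = nn.ModuleList(CLABlock(512, computes_kv=(i % 2 == 0))
                       for i in range(4))
x, kv = torch.randn(1, 16, 512), None
for layer in layers:
    x, kv = layer(x, shared_kv=kv)
```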
  4. Google has launched Model Explorer, an open-source tool designed to help users navigate and understand complex neural networks. The tool aims to provide a hierarchical approach to AI model visualization, enabling smooth navigation even for massive models. Model Explorer has already proved valuable in the deployment of large models to resource-constrained platforms and is part of Google's broader ‘AI on the Edge’ initiative.
  5. Stay informed about the latest artificial intelligence (AI) terminology with this comprehensive glossary. From algorithm and AI ethics to generative AI and overfitting, learn the essential AI terms that will help you sound smart over drinks or impress in a job interview.
  6. Researchers from NYU Tandon School of Engineering investigated whether modern natural language processing systems could solve the daily Connections puzzles from The New York Times. The results showed that while all the AI systems could solve some of the puzzles, they struggled overall.
  7. This article discusses the process of training a large language model (LLM) using reinforcement learning from human feedback (RLHF) and a newer alternative, Direct Preference Optimization (DPO). It explains how these methods align the LLM with human preferences, and how DPO achieves this with a simpler, more efficient training setup than RLHF.
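    The DPO objective itself is compact enough to write out. The sketch below assumes per-response log-probabilities have already been summed over tokens under the trained policy and a frozen reference model; the beta value is illustrative.

```python
# Sketch of the DPO loss; inputs are per-response log-probabilities,
# already summed over tokens (an assumption of this sketch).
import torch
import torch.nn.functional as F

def dpo_loss(policy_chosen_logps: torch.Tensor,
             policy_rejected_logps: torch.Tensor,
             ref_chosen_logps: torch.Tensor,
             ref_rejected_logps: torch.Tensor,
             beta: float = 0.1) -> torch.Tensor:
    # Implicit rewards: how far the policy has moved away from the
    # frozen reference model on each response.
    chosen_rewards = beta * (policy_chosen_logps - ref_chosen_logps)
    rejected_rewards = beta * (policy_rejected_logps - ref_rejected_logps)
    # Maximize the margin between preferred and rejected responses.
    return -F.logsigmoid(chosen_rewards - rejected_rewards).mean()
```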
  8. - Standardization, governance, simplified troubleshooting, and reusability in ML application development.
     - Integrations with vector databases and LLM providers to support new applications.
     - Provides tutorials on integrating
  9. This article provides a beginner-friendly introduction to Large Language Models (LLMs) and explains the key concepts in a clear and organized way.
  10. • A beginner's guide to understanding Hugging Face Transformers, a library that provides access to thousands of pre-trained transformer models for natural language processing, computer vision, and more.
    • The guide covers the basics of Hugging Face Transformers, including what it is, how it works, and how to use it, with a simple example of running Microsoft's Phi-2 LLM in a notebook (sketched after this entry).
    • The guide is designed for non-technical individuals who want to understand open-source machine learning without prior knowledge of Python or machine learning.
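    Roughly what that notebook example looks like with the transformers pipeline API; the prompt and generation settings are illustrative, not the guide's exact code.

```python
# Illustrative notebook snippet for running microsoft/phi-2 via the
# high-level pipeline API (prompt and settings are assumptions).
from transformers import pipeline

generator = pipeline("text-generation", model="microsoft/phi-2")
out = generator("Explain what a transformer model is in one sentence:",
                max_new_tokens=60)
print(out[0]["generated_text"])
```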
